Modeling Information Flow Through Deep Neural Networks

نویسندگان

  • Ahmad Chaddad
  • Behnaz Naisiri
  • Marco Pedersoli
  • Eric Granger
  • Christian Desrosiers
  • Matthew Toews
چکیده

This paper proposes a principled information theoretic analysis of classification for deep neural network structures, e.g. convolutional neural networks (CNN). The output of convolutional filters is modeled as a random variable Y conditioned on the object class C and network filter bank F . The conditional entropy (CENT) H(Y |C,F ) is shown in theory and experiments to be a highly compact and class-informative code, that can be computed from the filter outputs throughout an existing CNN and used to obtain higher classification results than the original CNN itself. Experiments demonstrate the effectiveness of CENT feature analysis in two separate CNN classification contexts. 1) In the classification of neurodegeneration due to Alzheimer’s disease (AD) and natural aging from 3D magnetic resonance image (MRI) volumes, 3 CENT features result in an AUC=94.6% for whole-brain AD classification, the highest reported accuracy on the public OASIS dataset used and 12% higher than the softmax output of the original CNN trained for the task. 2) In the context of visual object classification from 2D photographs, transfer learning based on a small set of CENT features identified throughout an existing CNN leads to AUC values comparable to the 1000-feature softmax output of the original network when classifying previously unseen object categories. The general information theoretical analysis explains various recent CNN design successes, e.g. densely connected CNN architectures, and provides insights for future research directions in deep learning.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition

Deep neural network models have achieved considerable success in a wide range of fields. Several architectures have been proposed to alleviate the vanishing gradient problem, and hence enable training of very deep networks. In the speech recognition area, convolutional neural networks, recurrent neural networks, and fully connected deep neural networks have been shown to be complimentary in the...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

A Laboratory Study on Stress Dependency of Joint Transmissivity and its Modeling with Neural Networks, Fuzzy Method and Regression Analysis

Correct estimation of water inflow into underground excavations can decrease safety risks and associated costs. Researchers have proposed different methods to asses this value. It has been proved that water transmissivity of a rock joint is a function of factors, such as normal stress, joint roughness and its size and water pressure therefore, a laboratory setup was proposed to quantitatively m...

متن کامل

Training Very Deep Networks

Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success. However, training becomes more difficult as depth increases, and training of very deep networks remains an open problem. Here we introduce a new architecture designed to overcome this. Our so-called highway networks allow unimpeded information flow across many layers on information highw...

متن کامل

معرفی شبکه های عصبی پیمانه ای عمیق با ساختار فضایی-زمانی دوگانه جهت بهبود بازشناسی گفتار پیوسته فارسی

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1712.00003  شماره 

صفحات  -

تاریخ انتشار 2017